Combined first-principles calculation and neural-network correction approach as a
powerful tool in computational physics and chemistry

Despite their success, the results of first-principles quantum mechanical calculations contain
inherent numerical errors caused by various approximations. We propose here a neural-network
algorithm to greatly reduce these inherent errors. As a demonstration, this combined quantum
mechanical calculation and neural-network correction approach is applied to the evaluation of the
standard heat of formation ΔfH° and the standard Gibbs energy of formation ΔfG° for 180 organic
molecules at 298 K. A dramatic reduction of numerical errors is clearly shown, with systematic
deviations being eliminated. For example, the root-mean-square deviation of the calculated ΔfH°
(ΔfG°) for the 180 molecules is reduced from 21.4 (22.3) kcal·mol⁻¹ to 3.1 (3.3) kcal·mol⁻¹ for
B3LYP/6-311+G(d,p) and from 12.0 (12.9) kcal·mol⁻¹ to 3.3 (3.4) kcal·mol⁻¹ for
B3LYP/6-311+G(3df,2p) before and after the neural-network correction.

PACS numbers: 31.15.Ew, 31.30.-i, 31.15.-p, 31.15.Ar



One of the Holy Grails of computational science is
to quantitatively predict the properties of matter prior to
experiments. Although first-principles quantum mechanical
calculation [1, 2] has become an indispensable research tool
and experimentalists increasingly rely on computational
results to interpret their findings, the numerical methods
used in practice are often not accurate enough, in
particular for complex systems. This limitation is caused
by the inherent approximations adopted in first-principles
methods. Because of its computational cost, electron
correlation has always been a difficult obstacle for
first-principles calculations. The finite basis sets chosen
in practical computations cannot cover the entire physical
space, and this inadequacy introduces further inherent
computational errors. Effective core potentials are
frequently used to approximate relativistic effects, which
inevitably results in errors for systems that contain heavy
atoms. The accuracy of a density-functional theory (DFT)
calculation is mainly determined by the exchange-correlation
(XC) functional being employed, whose exact form is,
however, unknown. Nevertheless, the results of
first-principles quantum mechanical calculations can capture
the essence of the physics. For instance, the calculated
results, although their absolute values may agree poorly
with measurements, usually follow the same trend among
different molecules as their experimental counterparts. The
quantitative discrepancy between the calculated and
experimental results depends predominantly on the property
of primary interest and, to a lesser extent, on other
related properties of the material. There thus exists a
quantitative relation between the calculated and
experimental results, since the aforementioned
approximations contribute, to a large extent, to the
systematic errors of a specified first-principles method.
Can we develop general ways to eliminate the systematic
computational errors and, further, to quantify the
accuracies of the numerical methods used? It has proven
extremely difficult to determine the calculation errors
from first principles. Alternatives must be sought.

We propose here a neural-network algorithm to determine
the quantitative relationship between the experimental data
and the first-principles calculation results. The determined
relation is subsequently used to eliminate the systematic
deviations of the calculated results and thus reduce the
numerical uncertainties. Since their beginning in the late
fifties, neural networks have been applied to various
engineering problems, such as robotics, pattern recognition
and speech. As the first application of neural networks to
quantum mechanical calculations of molecules, we choose the
standard heat of formation ΔfH° and the standard Gibbs
energy of formation ΔfG° at 298.15 K as the properties of
interest.

A total of 180 small- or medium-sized organic molecules,
whose ΔfH° and ΔfG° values are well documented in three
references, are selected to test our proposed approach. The
tabulated values of ΔfH° and ΔfG° in the three references
differ by less than 1.0 kcal·mol⁻¹ for the same molecule.
The uncertainties of all ΔfH° values are less than
1.0 kcal·mol⁻¹, while those of the ΔfG° values are not
reported in these references. The selected molecules contain
the elements H, C, N, O, F, Si, S, Cl and Br. The heaviest
molecule contains 14 heavy atoms, and the largest has 32
atoms. We divide these molecules randomly into a training
set (150 molecules) and a testing set (30 molecules). The
geometries of the 180
molecules are optimized via B3LYP/6-311+G(d,p)
calculations, and the zero-point energies (ZPEs) are
calculated at the same level. The enthalpy and Gibbs energy
of each molecule are calculated at both B3LYP/6-311+G(d,p)
and B3LYP/6-311+G(3df,2p); the latter employs the larger
basis set. The unscaled B3LYP/6-311+G(d,p) ZPE is employed
in the ΔfH° and ΔfG° calculations. Strategies from the
literature are adopted to calculate ΔfH° and ΔfG°. The
calculated ΔfH° and ΔfG° for B3LYP/6-311+G(d,p) and
B3LYP/6-311+G(3df,2p) are compared to their experimental
counterparts in Figs. 1 and 2, respectively. The horizontal
coordinates are the raw calculated data, and the vertical
coordinates are the experimental values. The dashed lines
are where the vertical and horizontal coordinates are
equal, i.e., where the
B3LYP calculations and experiments would match perfectly.
The raw calculated values lie mostly below the dashed line,
i.e., most raw ΔfH° and ΔfG° values are larger than the
experimental data. In other words, there are systematic
deviations in both the B3LYP ΔfH° and ΔfG° values. Compared
to the experimental measurements, the root-mean-square (RMS)
deviations of ΔfH° (ΔfG°) are 21.4 (22.3) and 12.0 (12.9)
kcal·mol⁻¹ for the B3LYP/6-311+G(d,p) and
B3LYP/6-311+G(3df,2p) calculations, respectively. In Table I
we compare the B3LYP and experimental ΔfH° values for 10 of
the 180 molecules. Overall,
B3LYP/6-311+G(3df,2p) calculations yield better agreement
with the experiments than B3LYP/6-311+G(d,p). In
particular, for small molecules with few heavy elements,
B3LYP/6-311+G(3df,2p) calculations result in very small
deviations from the experiments. For instance, the ΔfH°
deviations for CH4 and CS2 are only -0.5 and 0.6
kcal·mol⁻¹, respectively. Our B3LYP/6-311+G(3df,2p) results
are also in good agreement with those of reference [9],
which employed a similar calculation strategy except that
their ZPEs were scaled by a factor of 0.98 or 0.96 and
their geometries were optimized at B3LYP/6-31+G(d). For
large molecules, both B3LYP/6-311+G(d,p) and
B3LYP/6-311+G(3df,2p) calculations yield quite large
deviations from their experimental counterparts.
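
The RMS deviation quoted above, and the idea of a systematic deviation, can be made concrete with a short sketch. The Python below is illustrative only: the function names are hypothetical, and the three sample molecules reuse the experimental values and B3LYP/6-311+G(d,p) deviations listed for CH4, CS2 and CH2F2 in Table I rather than the full 180-molecule set.

```python
import math


def rms_deviation(calculated, experimental):
    """Root-mean-square deviation between paired calculated and experimental values."""
    if len(calculated) != len(experimental) or not calculated:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(c - e) ** 2 for c, e in zip(calculated, experimental)]
    return math.sqrt(sum(squared) / len(squared))


def mean_signed_deviation(calculated, experimental):
    """Mean of (calculated - experimental); a value far from zero signals a systematic shift."""
    return sum(c - e for c, e in zip(calculated, experimental)) / len(calculated)


# Values in kcal/mol for CH4, CS2 and CH2F2.
# calculated = experimental + DFT1 deviation, both read from Table I.
experimental = [-17.8, 27.9, -108.1]
calculated = [-16.7, 36.6, -100.1]

print(rms_deviation(calculated, experimental))
print(mean_signed_deviation(calculated, experimental))
```

A positive mean signed deviation of similar size to the RMS deviation, as in this small sample, is the signature of the systematic overestimate that the neural-network correction is meant to remove.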

TABLE I: Experimental and calculated ΔfH°(298 K) for ten
selected compounds (all data are in units of kcal·mol⁻¹).

                             Deviations (Theory - Expt.)
Molecule  Expt.(a)      DFT1(b)  DFT1-NN(c)  DFT2(d)  DFT2-NN(e)  DFT3(f)
CF2O      -152.9±0.4     20.0      6.9         8.7      6.8         9.1
CH2Cl2     -22.8±0.3     10.6      3.6         5.0      4.9         4.6
CH2F2     -108.1±0.2      8.0      0.9         0.6      0.6         0.0
CH4        -17.8±0.1      1.1      1.1        -0.5      1.0        -1.6
CS2         27.9±0.2      8.7      3.3         0.6      3.2         0.2
C5H12      -35.1±0.2     16.7     -2.1         9.9     -2.2
C5H12      -75.3±0.3     23.2      0.2        14.0      0.1
C6H14      -41.1±0.2     25.1      1.4        17.0      1.5
C8H10        4.6±0.3     25.7      0.5        13.3      1.0
C9H12       -2.3±0.3     31.3      0.9        17.6      1.8

(a) Experimental values taken from the literature.
(b) Deviations of ΔfH° calculated using B3LYP/6-311+G(d,p)
    geometries, zero-point energies and enthalpies.
(c) Deviations of ΔfH° calculated by the
    B3LYP/6-311+G(d,p)-neural-network approach.
(d) Deviations of ΔfH° calculated using the 6-311+G(d,p)
    geometries and zero-point energies, and the enthalpies
    calculated with the 6-311+G(3df,2p) basis.
(e) Deviations of ΔfH° calculated by the
    B3LYP/6-311+G(3df,2p)-neural-network approach.
(f) Deviations taken from [9], where the zero-point energies
    were corrected by a scale factor.

Our neural network adopts a three-layer architecture: an
input layer consisting of the physical descriptors and a
bias, a hidden layer containing a number of hidden neurons,
and an output layer that outputs the corrected value of
ΔfH° or ΔfG° (see Fig. 3). The number of hidden neurons is
to be determined. The most important issue is to select the
proper physical descriptors of our molecules, which are
used as the input for our neural network. The calculated
ΔfH° and ΔfG° contain the essence of the exact ΔfH° and
ΔfG°, respectively, and are thus the obvious choices of
primary descriptor for correcting ΔfH° and ΔfG°,
respectively. We observe that the size of a molecule
affects the accuracy of the calculations: the more atoms a
molecule has, the worse the calculated ΔfH° and ΔfG° are.
This is consistent with the general observations in the
field [9]. The total number of atoms Nt in a molecule is
thus chosen as the second descriptor. The ZPE is an
important parameter in calculating ΔfH° and ΔfG°. Its
calculated value is often scaled in evaluating ΔfH° and
ΔfG° [9], and it is thus taken as the third physical
descriptor. Finally, the number of double bonds, Ndb, is
selected as the fourth and last descriptor to reflect the
chemical structure of the molecule.
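
As a minimal sketch of how the four descriptors and the bias could be assembled into network inputs, the Python below builds one input row per molecule. The text says only that the inputs are scaled, so the min-max scaling used here is an assumption, as are the class and function names and the sample descriptor values.

```python
from dataclasses import dataclass


@dataclass
class Molecule:
    raw_value: float       # raw calculated heat or Gibbs energy of formation, kcal/mol
    n_atoms: int           # total number of atoms, Nt
    zpe: float             # calculated zero-point energy, kcal/mol
    n_double_bonds: int    # number of double bonds, Ndb


def min_max_scale(values):
    """Map values linearly onto [0, 1] (assumed scaling); a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def build_inputs(molecules):
    """Return one [x1, x2, x3, x4, x5] row per molecule, with x5 = 1 as the bias."""
    columns = [
        min_max_scale([m.raw_value for m in molecules]),
        min_max_scale([float(m.n_atoms) for m in molecules]),
        min_max_scale([m.zpe for m in molecules]),
        min_max_scale([float(m.n_double_bonds) for m in molecules]),
    ]
    return [[col[k] for col in columns] + [1.0] for k in range(len(molecules))]


# Hypothetical descriptor values, for illustration only.
molecules = [
    Molecule(raw_value=-16.7, n_atoms=5, zpe=27.7, n_double_bonds=0),
    Molecule(raw_value=36.6, n_atoms=3, zpe=4.3, n_double_bonds=2),
    Molecule(raw_value=-18.4, n_atoms=17, zpe=99.0, n_double_bonds=0),
]
for row in build_inputs(molecules):
    print(row)
```

Whatever scaling is chosen, the same transformation fitted on the training set would have to be reused for the testing set, and the network output mapped back to kcal·mol⁻¹ with the inverse transformation.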

FIG. 1: Experimental ΔfH° versus calculated ΔfH° for all 180 compounds. (a) and (b) are for raw B3LYP/6-311+G(d,p)
and B3LYP/6-311+G(3df,2p) results, respectively; (c) and (d) are for the neural-network corrected B3LYP/6-311+G(d,p) and
B3LYP/6-311+G(3df,2p) ΔfH° values, respectively. In (c) and (d), triangles are for the training set and crosses for the testing set.
Insets are the histograms of the differences between the experimental and calculated ΔfH° values. All values are in units of
kcal·mol⁻¹.

To ensure the quality of our neural network, a
cross-validation procedure is employed to determine it [10].
We further divide the 150 training molecules randomly into
five subsets of equal size. Four of them are used to train
the neural network, and the fifth to validate its
predictions. This procedure is repeated five times in
rotation. The number of neurons in the hidden layer is
varied from 2 to 10 to decide the optimal structure of our
neural network. We find that a hidden layer containing two
neurons yields the best overall results. Therefore, the
5-2-1 structure is adopted for our neural network, as
depicted in Fig. 3. The input values at the input layer,
x1, x2, x3, x4 and x5, are the scaled ΔfH° (or ΔfG°), Nt,
ZPE, Ndb and the bias, respectively. The bias x5 is set to
1. The weights {Wxij} connect the input layer {xi} and the
hidden neurons y1 and y2, and the weights {Wyj} connect the
hidden neurons and the output Z, which is the scaled ΔfH°
or ΔfG° upon neural-network correction. The output Z is
related to the input {xi} as

$$ Z = \sum_{j=1,2} Wy_j \, \mathrm{Sig}\Big( \sum_{i=1,5} Wx_{ij} \, x_i \Big), \qquad (1) $$

where Sig(v) = 1/(1 + exp(-αv)) and α is a parameter that
controls the switch steepness of the sigmoidal function
Sig(v). An error back-propagation learning procedure is
used to optimize the values of Wxij and Wyj (i = 1, 2, 3,
4, 5 and j = 1, 2). In panels (c) and (d) of Figs. 1 and 2,
the triangles belong to the training set and the crosses to
the testing set. Compared to the raw calculated results,
the neural-network corrected values are much closer to the
experimental values for both the training and testing sets.
More importantly, the systematic deviations for ΔfH° and
ΔfG° in Figs. 1 and 2 are eliminated, and the resulting
numerical deviations are reduced substantially. This is
further demonstrated by the error analysis performed for
the raw and neural-network corrected ΔfH° and ΔfG° values
of all 180 molecules. In the insets of Figs. 1 and 2 we
plot the histograms of the deviations (from the
experiments) of the raw B3LYP ΔfH° and ΔfG° values and of
their neural-network corrected values. The raw calculated
values have large systematic deviations, while the
neural-network corrected values have virtually no
systematic deviations. Moreover, the remaining numerical
deviations are much smaller. Upon the neural-network
correction, the RMS deviations of ΔfH° (ΔfG°) are reduced
from 21.4 (22.3) kcal·mol⁻¹ to 3.1 (3.3) kcal·mol⁻¹ and
from 12.0 (12.9) kcal·mol⁻¹ to 3.3 (3.4) kcal·mol⁻¹ for
B3LYP/6-311+G(d,p) and B3LYP/6-311+G(3df,2p), respectively.
Note that the error distributions after the neural-network
correction are approximately Gaussian (see Figs. 1 and 2).
Although the raw B3LYP/6-311+G(d,p) results have much
larger deviations than those of B3LYP/6-311+G(3df,2p), the
neural-network corrected values of both calculations have
deviations of the same magnitude. This implies that it is
sufficient to employ the smaller basis set 6-311+G(d,p) in
our combined DFT calculation and neural-network correction
(or DFT-NEURON) approach. The neural-network algorithm can
easily correct the deficiency of a small basis set.
Therefore, the DFT-NEURON approach can potentially be
applied to much larger systems. In Table I we also list the
neural-network corrected ΔfH° values of the 10 molecules.
The deviations of large molecules are of the same magnitude
as those of small molecules. Unlike other quantum
mechanical calculations, which usually yield worse results
for larger molecules than for small ones, the DFT-NEURON
approach does not discriminate against large molecules.
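
The data partitioning described above (180 molecules split 150/30, then the 150 training molecules rotated through five equal subsets) can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names and the fixed random seed are assumptions, and the training call itself is left out.

```python
import random


def split_train_test(n_total=180, n_test=30, seed=0):
    """Randomly partition molecule indices into training and testing sets."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    return indices[n_test:], indices[:n_test]


def five_fold_rotation(train_indices, n_folds=5):
    """Yield (fit, validate) index lists, holding out each subset once in rotation."""
    if len(train_indices) % n_folds != 0:
        raise ValueError("training set must divide evenly into the folds")
    size = len(train_indices) // n_folds
    folds = [train_indices[k * size:(k + 1) * size] for k in range(n_folds)]
    for k in range(n_folds):
        validate = folds[k]
        fit = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield fit, validate


train, test = split_train_test()
for fit, validate in five_fold_rotation(train):
    # A network would be trained on `fit` and scored on `validate` here.
    print(len(fit), len(validate))  # prints "120 30" on each of the five rounds
```

Under this scheme the rotation would be repeated for each candidate hidden-layer size from 2 to 10, and the size with the best validation error kept; the 30 testing molecules play no part in that choice.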

FIG. 2: Experimental ΔfG° versus calculated ΔfG° for all 180 compounds. (a) and (b) are for raw B3LYP/6-311+G(d,p)
and B3LYP/6-311+G(3df,2p) results, respectively; (c) and (d) are for the neural-network corrected B3LYP/6-311+G(d,p) and
B3LYP/6-311+G(3df,2p) ΔfG° values, respectively. Legends, units and insets are similar to those of Fig. 1.

FIG. 3: Structure of our neural network (input layer, hidden layer and output layer).

Analysis of our neural network reveals that the weights
connecting the input for ΔfH° or ΔfG° have the dominant
contribution in all cases. This confirms our fundamental
assumption that the calculated ΔfH° (ΔfG°) captures the
essential value of the exact ΔfH° (ΔfG°). The input for the
second physical descriptor, Nt, has quite large weights in
all cases. In particular, when the smaller basis set
6-311+G(d,p) is adopted in the B3LYP calculations, Nt has
the second largest weights. It is found that the raw ΔfH°
and ΔfG° deviations are roughly proportional to Nt, which
confirms the importance of Nt as a significant descriptor
in our neural network. The bias contributes to the
correction of the systematic deviations in the raw
calculated data, and thus has significant weights. When the
larger basis set 6-311+G(3df,2p) is used, the bias has the
second largest weights in all cases. The ZPE has often been
scaled to account for the discrepancies in ΔfH° or ΔfG°
between calculations and experiments [9], and it is thus
expected to have large weights. This is indeed the case,
especially when the smaller basis set 6-311+G(d,p) is
adopted in the calculations. In all cases the number of
double bonds, Ndb, has the smallest but non-negligible
weights. In Table II we list the values of {Wxij} and
{Wyj} of the two neural networks for correcting the ΔfG°
values of the B3LYP/6-311+G(d,p) and B3LYP/6-311+G(3df,2p)
calculations.
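
To show how Eq. (1) turns a set of weights into a corrected value, the sketch below evaluates the 5-2-1 network with the DFT1-NN weights of Table II and applies a single gradient-descent update of the kind an error back-propagation procedure performs. Several things here are assumptions rather than details from the text: the steepness α = 1, the logistic form of Sig, the squared-error loss, the learning rate, the reading of the first weight row as Wx1j, and the sample input vector and target.

```python
import math

ALPHA = 1.0  # assumed steepness; its value is not given in the text

# Table II, DFT1-NN columns: WX[i][j] links input x_(i+1) to hidden neuron y_(j+1).
WX = [
    [0.78, -0.72],   # scaled raw calculated value
    [-0.60, 0.02],   # Nt
    [0.44, 0.02],    # ZPE
    [0.07, 0.24],    # Ndb
    [-0.42, -0.04],  # bias
]
WY = [1.48, -0.57]   # hidden neurons to output


def sig(v, alpha=ALPHA):
    """Logistic sigmoid with steepness alpha."""
    return 1.0 / (1.0 + math.exp(-alpha * v))


def forward(x, wx, wy, alpha=ALPHA):
    """Eq. (1): return the scaled output Z and the two hidden activations."""
    hidden = [sig(sum(wx[i][j] * x[i] for i in range(len(x))), alpha)
              for j in range(len(wy))]
    z = sum(wy[j] * hidden[j] for j in range(len(wy)))
    return z, hidden


def backprop_step(x, target, wx, wy, lr=0.1, alpha=ALPHA):
    """One gradient-descent update of E = (Z - target)^2 / 2; returns new weights."""
    z, hidden = forward(x, wx, wy, alpha)
    err = z - target
    new_wy = [wy[j] - lr * err * hidden[j] for j in range(len(wy))]
    new_wx = [
        [wx[i][j] - lr * err * wy[j] * alpha * hidden[j] * (1.0 - hidden[j]) * x[i]
         for j in range(len(wy))]
        for i in range(len(x))
    ]
    return new_wx, new_wy


x = [0.5, 0.3, 0.4, 0.1, 1.0]  # hypothetical scaled inputs; x5 = 1 is the bias
z, _ = forward(x, WX, WY)
print(z)
```

The output is a scaled quantity and would still need to be mapped back to kcal·mol⁻¹. In an actual fit, the update in `backprop_step` would be applied repeatedly over all training molecules, starting from random weights rather than from the converged values used here.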

Our DFT-NEURON approach has an RMS deviation of
~3 kcal·mol⁻¹ for the 180 small- to medium-sized organic
molecules. This is slightly larger than their experimental
uncertainties. The physical descriptors adopted in our
neural network, namely the raw calculated ΔfH° or ΔfG°, the
number of atoms Nt, the number of double bonds Ndb and the
ZPE, are quite general and are not limited to special
properties of organic molecules. The DFT-NEURON approach
developed here is thus expected to yield an RMS deviation
of ~3 kcal·mol⁻¹ for the ΔfH° and ΔfG° values of any small-
to medium-sized organic molecule. G2 method results are
more accurate for small molecules. However, our approach is
much more efficient and can be
applied to much larger systems. To improve the accuracy of
the DFT-NEURON approach, we need more and better
experimental data, and possibly more and better physical
descriptors for the molecules. Besides ΔfH° and ΔfG°, the
DFT-NEURON approach can be generalized to calculate other
properties such as ionization energy, dissociation energy,
absorption frequency and band gap. The raw first-principles
value of the property of interest contains its essential
value, and is thus always the primary descriptor. Since the
raw calculation error accumulates as the molecular size
increases, the number of atoms Nt should be selected as a
descriptor in any DFT-NEURON calculation. Additional
physical descriptors should be chosen according to their
relations to the property of interest and to the physical
and chemical structures of the compounds. Others have used
neural networks to determine the quantitative relationship
between experimental thermodynamic properties and the
structure parameters of molecules [10]. Our work differs
from these in specifically utilizing first-principles
methods, with the objective of improving quantum mechanical
results. Since first-principles calculations readily
capture the essence of the properties of interest, our
approach is more reliable and covers a much wider range of
molecules or compounds.

TABLE II: Weights of the DFT-neural networks for ΔfG°

             DFT1-NN(a)         DFT2-NN(b)
Weights      y1       y2        y1       y2
Wx1j         0.78    -0.72      0.83    -0.73
Wx2j        -0.60     0.02     -0.30     0.02
Wx3j         0.44     0.02      0.18     0.02
Wx4j         0.07     0.24      0.05     0.17
Wx5j        -0.42    -0.04     -0.46     0.01
Wyj          1.48    -0.57      1.44    -0.47

(a) DFT1-NN refers to the B3LYP/6-311+G(d,p)-neural-network
    approach.
(b) DFT2-NN refers to the B3LYP/6-311+G(3df,2p)-neural-network
    approach.

To summarize, we have developed a promising new approach
to improve the results of first-principles quantum
mechanical calculations and to calibrate their
uncertainties. The accuracy of the DFT-NEURON approach can
be systematically improved as more and better experimental
data become available. Because the systematic deviations
caused by small basis sets and less sophisticated methods
can easily be corrected by neural networks, the
requirements on the first-principles calculations are
modest. Our approach is thus highly efficient compared to
much more sophisticated first-principles methods of similar
accuracy and, more importantly, is expected to be
applicable to much larger systems. The combined
first-principles calculation and neural-network correction
approach developed in this work is potentially a powerful
tool in computational physics and chemistry, and may open
the possibility for first-principles methods to be employed
practically as predictive tools in materials research and
design.